HalluciNet-ing Spatiotemporal Representations Using a 2D-CNN
نویسندگان
چکیده
Spatiotemporal representations learned using 3D convolutional neural networks (CNN) are currently used in state-of-the-art approaches for action-related tasks. However, 3D-CNN notorious being memory and compute resource intensive as compared with more simple 2D-CNN architectures. We propose to hallucinate spatiotemporal from a teacher student. By requiring the predict future intuit upcoming activity, it is encouraged gain deeper understanding of actions how they evolve. The hallucination task treated an auxiliary task, which can be any other multitask learning setting. Thorough experimental evaluation, shown that indeed helps improve performance on action recognition, quality assessment, dynamic scene recognition From practical standpoint, able without actual enable deployment resource-constrained scenarios, such limited computing power and/or lower bandwidth. also observed our has utility not only during training phase, but pre-training phase.
منابع مشابه
SURF-ing a Model of Spatiotemporal Saliency
Zhai and Shah (2006) proposed a model of spatiotemporal saliency using a combination of temporal and spatial attention models. The temporal model utilized Lowe’s SIFT (2004) to compute feature points and the correspondences between them in successive frames. Bay, Tuytelaars, & Van Gool introduced SURF (2006) as an alternative feature detector and descriptor. The authors of SURF show that it is ...
متن کاملCNN Technology for Spatiotemporal Signal Processing
Copyright © 2009 David L ´ opez Vilariño et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Cellular Neural Networks (CNNs) are a paradigm for non-linear spatial-temporal dynamics and the core of the Cellular Wave Computing (a...
متن کاملMatching 3D Shapes Using 2D Conformal Representations
Matching 3D shapes is a fundamental problem in Medical Imaging with many applications including, but not limited to, shape deformation analysis, tracking etc. Matching 3D shapes poses a computationally challenging task. The problem is especially hard when the transformation sought is diffeomorphic and non-rigid between the shapes being matched. In this paper, we propose a novel and computationa...
متن کاملVisual Language Modeling on CNN Image Representations
Measuring the naturalness of images is important to generate realistic images or to detect unnatural regions in images. Additionally, a method to measure naturalness can be complementary to Convolutional Neural Network (CNN) based features, which are known to be insensitive to the naturalness of images. However, most probabilistic image models have insufficient capability of modeling the comple...
متن کاملMarginalized CNN: Learning Deep Invariant Representations
Training a deep neural network usually requires sufficient annotated samples. The scarcity of supervision samples in practice thus becomes the major bottleneck on performance of the network. In this work, we propose a principled method to circumvent this difficulty through marginalizing all the possible transformations over samples, termed as marginalized Convolutional Neural Network (mCNN). mC...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Signals
سال: 2021
ISSN: ['2624-6120']
DOI: https://doi.org/10.3390/signals2030037